MAIN: Multi-Attention Instance Network for video segmentation

Authors

Abstract

Instance-level video segmentation requires a solid integration of spatial and temporal information. However, current methods rely mostly on domain-specific information (online learning) to produce accurate instance-level segmentations. We propose a novel approach that relies exclusively on generic spatio-temporal attention cues. Our strategy, named Multi-Attention Instance Network (MAIN), overcomes challenging scenarios over arbitrary videos without modeling sequence- or instance-specific knowledge. We design MAIN to segment multiple instances in a single forward pass, and optimize it with a loss function that favors class-agnostic predictions and assigns penalties. We achieve state-of-the-art performance on the Youtube-VOS dataset benchmark, improving the unseen Jaccard and F-Metric by 6.8% and 12.7%, respectively, while operating in real time (30.3 FPS).
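
For readers unfamiliar with the Jaccard score reported in the abstract, the following is a minimal sketch of intersection over union for one pair of binary masks. The function name `jaccard_index`, the nested-list mask format, and the toy masks are illustrative assumptions, not taken from the paper; the benchmark's own evaluation protocol is not reproduced here.

```python
def jaccard_index(pred, target):
    """Intersection over union of two equally sized binary masks.

    Masks are nested lists of 0/1 values (a hypothetical format chosen
    for this sketch so that it needs no third-party libraries).
    """
    intersection = 0
    union = 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            if p and t:
                intersection += 1  # pixel is foreground in both masks
            if p or t:
                union += 1  # pixel is foreground in at least one mask
    # Assumed convention: two empty masks count as a perfect match.
    return 1.0 if union == 0 else intersection / union


# Toy 2x3 masks: 2 pixels overlap, 4 pixels are foreground in either mask.
pred = [[1, 1, 0],
        [0, 1, 0]]
target = [[1, 0, 0],
          [0, 1, 1]]
score = jaccard_index(pred, target)
```

For these toy masks the score is 2 / 4 = 0.5. Treating two empty masks as a perfect match is one possible convention and is assumed here only to avoid division by zero.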

Similar resources

A Multi-scale Multiple Instance Video Description Network

Generating natural language descriptions for in-the-wild videos is a challenging task. Most state-of-the-art methods for solving this problem borrow existing deep convolutional neural network (CNN) architectures (Alexnet, Googlenet) to extract a visual representation of the input video. However, these deep CNN architectures are designed for single-label centered-positioned object classification....

Path Aggregation Network for Instance Segmentation

The way that information propagates in neural networks is of great importance. In this paper, we propose Path Aggregation Network (PANet) aiming at boosting information flow in proposal-based instance segmentation framework. Specifically, we enhance the entire feature hierarchy with accurate localization signals in lower layers by bottom-up path augmentation, which shortens the information path...

MaskRNN: Instance Level Video Object Segmentation

Instance level video object segmentation is an important technique for video editing and compression. To capture the temporal coherence, in this paper, we develop MaskRNN, a recurrent neural net approach which fuses in each frame the output of two deep nets for each object instance — a binary segmentation net providing a mask and a localization net providing a bounding box. Due to the recurrent...

Dynamic Video Segmentation Network

In this paper, we present a detailed design of dynamic video segmentation network (DVSNet) for fast and efficient semantic video segmentation. DVSNet consists of two convolutional neural networks: a segmentation network and a flow network. The former generates highly accurate semantic segmentations, but is deeper and slower. The latter is much faster than the former, but its output requires fur...

Multi-instance Methods for Partially Supervised Image Segmentation

In this paper, we propose a new partially supervised multi-class image segmentation algorithm. We focus on the multi-class, single-label setup, where each image is assigned one of multiple classes. We formulate the problem of image segmentation as a multi-instance task on a given set of overlapping candidate segments. Using these candidate segments, we solve the multi-instance, multi-class proble...

Journal

Journal title: Computer Vision and Image Understanding

Year: 2021

ISSN: 1090-235X, 1077-3142

DOI: https://doi.org/10.1016/j.cviu.2021.103240